Ant Group Launches dInfer, Speeding Up Diffusion Language Model Inference by 10x!
Ant Group has open-sourced dInfer, the industry's first high-performance inference framework for diffusion language models, delivering a substantial improvement in inference speed. In benchmark tests, dInfer runs 10.7 times faster than NVIDIA's Fast-dLLM, reaching 1,011 tokens per second in single-batch inference on the HumanEval code-generation benchmark and pushing the technology closer to practical deployment.
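From the reported figures, the baseline throughput of Fast-dLLM can be back-calculated. A quick sketch (the 10.7x speedup and 1,011 tokens/s values come from the article; the baseline number is derived here, not reported):

```python
# Derive Fast-dLLM's implied throughput from the article's numbers:
# dInfer reaches 1,011 tokens/s and is reported as 10.7x faster.
dinfer_tps = 1011
speedup = 10.7

baseline_tps = dinfer_tps / speedup  # implied Fast-dLLM throughput
print(f"Implied Fast-dLLM throughput: {baseline_tps:.1f} tokens/s")
# -> Implied Fast-dLLM throughput: 94.5 tokens/s
```

This is only an arithmetic illustration of the claimed speedup; actual Fast-dLLM throughput depends on hardware, model size, and batch settings.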